Introduction

The Division of Operations Analysis of the U.S. Office of Education
is engaged in a continuing effort to develop models of various aspects
of the educational system. One major effort with which I have had the
pleasure to be associated* is the development of models of student
achievement. These models are intended to provide guidelines for
existing programs or to indicate where new programs might be useful.

Data Base

We have been working with a number of data bases but most of our
work has utilized a body of data obtained from the Educational
Opportunities Survey. This survey entailed the testing and surveying of about
650,000 students in some 4,000 public schools throughout the country in
grades 1, 3, 6, 9, and 12, together with their teachers, principals and
superintendents. The Survey sample consisted of a 5 percent sample of
schools. The data base is comprehensive in that detailed factual and
attitudinal information was collected on the students' home background,
attitude towards school, race relations and the world. A battery of
ability and achievement tests was administered at each grade level.
Information was collected from the teachers and principals concerning
their training and experience, their view of the school, etc. The final
part of the teacher questionnaire consisted of a 30-item contextual
vocabulary test which was intended to be a measure of the verbal facility
of the teacher. In addition, the principal provided data on the school's
facilities, staff, programs, curricula, etc. A report investigating the
Equality of Educational Opportunity for various racial and ethnic groups
was presented to the Congress under the principal authorship of James S.
Coleman. This report, which has become known as "The Coleman Report",
contains detailed information on the design of the survey and I will refer
you to that report for further details (Coleman et al., 1966).

I would like to dwell now on some of the things we have been trying
to accomplish using this data base.



* The author is indebted to his many colleagues in the National Center
for Educational Statistics for their helpful assistance through all
phases of this study. This paper was presented at the U.S. Office of
Education Symposium: Operations Analysis of Education, Washington, D.C.,
November 20-22, 1967.



Research Strategy 



Estimation of Missing Data

The main goal of the analyses we have been doing was to reduce the
more than 400 variables in an empirically meaningful way into indices and
sets of indices, so that the volume of data processing and the complexity of
later analyses could be reduced. It was hoped that the regression
equations would be more sharply defined if things that seemed to go
together, both empirically and on the basis of their content, were first
grouped together so that what they had in common could make a more clear-cut
contribution. Earlier experience with these data showed that when each school
facility such as a library or gymnasium was kept separate it might make only
a very small positive contribution to school achievement. It was also
planned to conduct systematic or explanatory between-school, within-school
and total regressions for various combinations of variables. By explanatory
regressions is meant that various combinations of subsets of variables would
be entered into the regressions to see which sets would help to explain the
predictable variance in achievement.

Before the variables could be reduced into meaningful groupings,
however, decisions had to be made concerning the estimation of missing data
and the coding or scaling of variables. As a guide in the estimation of
missing data or handling of non-responses, it was decided to analyze the
responses to each question against one or more criteria or dependent
variables so that not only the percent responding to each item or response
alternative, but also their mean score on the dependent variable could be
used as a guide in coding the variables and in assigning a value to the
non-respondents.

Since the approach differed somewhat for the student, teacher and
principal questionnaires, each analysis will be described separately. The
various steps that we went through are given in Table 1.

A factor analysis was conducted on the intercorrelations of the five
ninth grade achievement measures. These measures were: General Information,
Reading Comprehension, Verbal Ability, Mathematics Achievement and
Non-Verbal Ability. The factor analysis showed that a single factor could
be used to describe the intercorrelations of these achievement measures
(Mayeske and Weinfeld, Technical Note Number 21). Accordingly, the
weights from the first principal component of the intercorrelations were
used to weight scores on the individual tests and sum them to obtain an
overall achievement composite. It was this achievement composite which
was used as a criterion against which item responses were analyzed. This
achievement composite is also the dependent variable for many later
analyses.
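
The weighting step just described can be sketched as follows. This is only a minimal illustration under stated assumptions: the correlation values are hypothetical placeholders (not Survey figures), only three tests are shown rather than five, the function names are invented for the example, and power iteration stands in for whatever principal components routine was actually used.

```python
import math

def first_principal_component(corr, max_iter=1000, tol=1e-12):
    """Return (weights, root) for the largest root of a correlation matrix.

    Plain power iteration; the weights are scaled to unit length.
    """
    n = len(corr)
    weights = [1.0 / math.sqrt(n)] * n
    root = 0.0
    for _ in range(max_iter):
        product = [sum(corr[i][j] * weights[j] for j in range(n)) for i in range(n)]
        root = math.sqrt(sum(x * x for x in product))
        updated = [x / root for x in product]
        change = max(abs(a - b) for a, b in zip(updated, weights))
        weights = updated
        if change < tol:
            break
    return weights, root

def composite_score(standard_scores, weights):
    """Weighted sum of a student's standardized test scores."""
    return sum(w * z for w, z in zip(weights, standard_scores))

# Hypothetical intercorrelations of three tests (illustrative values only).
corr = [
    [1.00, 0.70, 0.60],
    [0.70, 1.00, 0.50],
    [0.60, 0.50, 1.00],
]
weights, root = first_principal_component(corr)

# Hypothetical standardized scores for one student on the three tests.
student = [0.5, -0.2, 1.0]
print([round(w, 3) for w in weights], round(root, 3))
print(round(composite_score(student, weights), 3))
```

When the tests are all positively intercorrelated, as assumed here, every weight is positive, so the composite behaves like a weighted total score.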

TABLE 1

Sequence of Steps Entailed in Data Analysis and Reduction

                                        School Variables
Student Variables        Teacher Variables        Principal Variables
-----------------------------------------------------------------------
Develop achievement
composite

Analyze variables        Analyze variables        Analyze variables
against achievement      against teacher's        against school size,
composite scores         verbal score             rural-urban and socio-
                                                  economic status, and
                                                  principal's salary

Criterion scale          Scale variables          Scale variables
variables

Correlate variables      Correlate variables      Correlate variables
and factor analyze       and factor analyze       and factor analyze
for indices              for indices              for indices


In order to maximize the linear relationship of each student variable
with student achievement, criterion scaling was employed. By criterion
scaling is meant that each item response was coded or scaled by assigning
the mean value of the dependent variable for each of the different
response alternatives for an item. Table 2 shows the criterion scale
analysis for the categorical variable of "Father's Occupation." The
reader will note the percent of 9th grade students responding to each item
alternative and their mean score on the achievement composite, where the
total responses for each item have been set to a mean of 50 and a standard
deviation of 10. When the mean value of the dependent variable is assigned
as the code or scale value for each item alternative, the items or variables
are said to be criterion scaled. Almost all of the 9th grade student
variables were coded in this manner (Weinfeld et al., Unpublished Manuscript
Number 60).
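
As a small illustration of criterion scaling, the sketch below assigns each response alternative the mean criterion score of the students who chose it. The seven-student sample and the function name are hypothetical; non-response is treated as just another alternative (code 0), which is one reading of how the non-respondents were given a value.

```python
from collections import defaultdict

def criterion_scale(responses, criterion):
    """Map each response alternative to the mean criterion score of its choosers."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for response, score in zip(responses, criterion):
        totals[response] += score
        counts[response] += 1
    return {response: totals[response] / counts[response] for response in totals}

# Hypothetical mini-sample: occupation codes (0 = non-response) and
# achievement composite scores for seven students.
responses = [9, 9, 7, 7, 0, 0, 10]
scores = [58.0, 55.0, 44.0, 42.0, 41.0, 45.0, 51.0]

codes = criterion_scale(responses, scores)
scaled = [codes[r] for r in responses]  # the criterion-scaled variable
print(codes)
```

Because each alternative is replaced by a group mean, the scaled variable has the same overall mean as the criterion itself.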

For the teacher variables, each item was analyzed against the teacher's
total score on a self-administered contextual vocabulary test (Mayeske et al.,
Technical Note Number 32). For the principal variables each item was
analyzed against the number of students enrolled in the school, the
rural-urban and socio-economic status of the school, and the principal's salary
(Mayeske et al., Unpublished Manuscript Number 61). These analyses were used
as guides in assigning codes or scale values and in estimating missing data.
However, for the teacher's and principal's questionnaires the items were
not coded so as to maximize their relationship with these dependent or
criterion variables.

Reduction of Variables

The intercorrelations of the student, teacher and principal sets of
variables were each subjected to a series of factor analyses. The objective
of these analyses was to obtain meaningful groupings of variables. To
accomplish this objective a large number of subsets of the variables were
each subjected to Principal Components analyses and Varimax rotations
(1965). The Principal Component method has the desirable property
that it extracts the roots and associated factors in descending order of
magnitude. Hence the first root is the largest, the second root the next
largest, etc. Factors with a root of one or greater were subjected to a
Varimax rotation. This is a technique for rotating the principal factors
into a position that may be meaningful. It attempts to maximize the high
and low weights for a factor so that the variables that have high weights
on a factor can be thought of as belonging together and an interpretive
label might be applied to what they have in common.
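
To make the rotation step concrete, the sketch below handles the simplest case of two retained factors, using the classic planar rotation angle for the raw (unnormalized) Varimax criterion. It is an assumption-laden illustration: the loadings are hypothetical, the function names are invented, and nothing here is taken from the program actually used in these analyses, which may have normalized the weights or rotated more than two factors.

```python
import math

def rotate(loadings, phi):
    """Rotate two-factor loadings (list of (x, y) pairs) by the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return [(x * c + y * s, -x * s + y * c) for x, y in loadings]

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(2):
        squared = [row[j] ** 2 for row in loadings]
        total += sum(v * v for v in squared) / p - (sum(squared) / p) ** 2
    return total

def varimax_angle(loadings):
    """Angle that maximizes the raw Varimax criterion for two factors."""
    p = len(loadings)
    u = [x * x - y * y for x, y in loadings]
    v = [2.0 * x * y for x, y in loadings]
    a, b = sum(u), sum(v)
    c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
    d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
    return math.atan2(d - 2.0 * a * b / p, c - (a * a - b * b) / p) / 4.0

# Hypothetical unrotated principal component weights for four variables.
unrotated = [(0.70, 0.40), (0.65, 0.45), (0.60, -0.50), (0.55, -0.55)]
phi = varimax_angle(unrotated)
rotated = rotate(unrotated, phi)
for before, after in zip(unrotated, rotated):
    print(before, tuple(round(w, 3) for w in after))
```

The rotation leaves each variable's communality (the sum of its squared weights) unchanged; it only redistributes the weights so that each variable tends to load mainly on one factor.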

This approach was essentially iterative in that variables that did 
not form meaningful groupings or blurred an otherwise meaningful grouping 
were eliminated and the remaining variables were refactored. The teacher 
and student variables readily fell into meaningful groupings after two 
iterations which resulted in the elimination of about six to twelve
variables from each set. The highest weights from the Varimax rotation were
used as multipliers for the variables to obtain index scores. In order to
keep the index score intercorrelations low, a variable was allowed to have
a weight on only one index.

TABLE 2. Percent of 9th Grade Students and Their Average Composite
Achievement Score Classified by Father's Occupation

                                                      COMPOSITE
Code  Father's Occupation                Percent    Mean*     S.D.
------------------------------------------------------------------
  1   Technical                            2.8     52.674   10.328
  2   Official                             4.1     52.299   10.226
  3   Manager                             12.6     53.451    9.160
  4   Semi-skilled                        16.6     50.060    9.119
  5   Salesman                             4.3     53.877    8.898
  6   Farm or ranch manager or owner       3.8     50.397   10.250
  7   Farm worker                          2.4     43.316    9.405
  8   Workman or laborer                  10.5     48.657    8.897
  9   Professional                         6.7     56.597    9.368
 10   Skilled worker or foreman           20.1     51.000    8.779
 11   Don't know                          10.8     43.057    8.847
  0   Non-response                         5.2     42.599   10.365
------------------------------------------------------------------
      TOTAL                              100.00**  50.000   10.000

*When the mean value is assigned as the code for that alternative the
variable is said to be criterion scaled.

**Based on 133,136 ninth grade students.

The variables from the principal questionnaire dealt with a wide
variety of different aspects of the school. These variables did not
readily fall into any naturally meaningful groups. Consequently, a priori
groupings, such as variables concerned with the physical plant or
instructional facilities, were subjected to a Principal Component analysis. The
weights from the first principal component were then used to obtain index
scores for each school.
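
The scoring rule used for the student and teacher indices, in which a variable carries a weight on only one index, might be sketched as below. Two points are assumptions rather than statements from the text: that the variables are standardized before weighting, and that "highest weight" means largest in absolute value. The weights, scores and function names are hypothetical.

```python
def assign_to_indices(rotated_weights):
    """For each variable, keep only its largest-magnitude rotated weight.

    Returns a list of (index_number, weight) pairs, one per variable.
    """
    assignment = []
    for row in rotated_weights:
        best = max(range(len(row)), key=lambda j: abs(row[j]))
        assignment.append((best, row[best]))
    return assignment

def index_scores(standard_scores, assignment, n_indices):
    """Weighted sums of the variables assigned to each index."""
    scores = [0.0] * n_indices
    for z, (index_number, weight) in zip(standard_scores, assignment):
        scores[index_number] += weight * z
    return scores

# Hypothetical rotated weights for four variables on two factors.
rotated_weights = [(0.80, 0.10), (0.70, 0.25), (0.15, 0.60), (-0.05, 0.90)]
assignment = assign_to_indices(rotated_weights)

# Hypothetical standardized scores of one respondent on the four variables.
respondent = [1.0, 0.5, -1.0, 2.0]
print(assignment)
print(index_scores(respondent, assignment, n_indices=2))
```

Restricting each variable to a single index is what keeps the index scores from sharing variables, and hence helps keep their intercorrelations low.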

Description of Indices

The lists of indices that follow give a brief description of the indices
obtained and other variables retained for future analyses. A detailed
description of the development of these indices is given in the list of
references (see Mayeske et al., Unpublished Manuscripts of Correlational and
Factorial Analyses).

When the full set of school variables is referred to later on, this
reference will pertain to the combined set of teacher, principal and
school indices and variables listed below.

Using these indices we are currently conducting systematic between- 
school, within-school and total analyses using correlational and regression 
techniques. In this paper I would like to focus on our most complete set 
of analyses. These analyses use ninth grade schools as the unit of analysis. 
Thus when we speak of Socio-Economic Status we are talking about the average 
of the socio-economic index scores for the ninth grade students in a 
particular school and when we speak of Achievement we are talking about the 
average achievement of the ninth grade students in a school. In a similar 
manner we are talking about the average Experience or Training of the 
teachers in the school. There were approximately 923 schools used in these
analyses. 

Discussion of Zero-Order Correlations 



Although our primary interest was in factors that contribute to
school achievement we felt that many of the other student indices such as
Expectations for Excellence, Attitude Toward Life, Educational Desires
and Plans, and Study Habits could also be regarded as being influenced by
the school. Consequently we included these indices as dependent variables
in addition to the Achievement Composite.

Student Indices

1. Expectations for Excellence - student believes that his mother, father
   and teacher want him to be a good student and he desires to be a good
   student,

2. Socio-Economic Status - defined by mother's and father's educational
   level, father's occupational level, rooms in the home, number of
   siblings, reading materials and appliances in the home and urbanness of
   background,

3. Attitude Toward Life - a student with a high score on this index
   believes that people like himself have a chance to be successful,
   when he tries to get ahead he won't experience many obstacles, hard
   work is more important than good luck for success, won't have a hard
   time getting a job with a good education, etc.,

4. Family Structure and Stability - a student with a high score has both
   his father and mother in the home, father is the major source of
   income, he hasn't changed schools recently, etc.,

5. Educational Desires and Plans - a student with a high score desires
   and plans to go to college, his parents want him to go to college and
   he has high occupational level aspirations,

6. Study Habits - a student with a high score spends about 2 hours a day
   studying, has frequent discussions about his school work with his
   parents, was read to as a child before he started school, read many
   books during the summer, etc.,

7. Racial-Ethnic Differences in Achievement - a variable created by
   assigning each student the average achievement score obtained by his
   racial or ethnic group.



Teacher Indices

1. Experience - comprised of the teacher's age, years of teaching
   experience and years of teaching in his present school,

2. Teaching Conditions - comprised of various aspects of the teacher's
   view of his teaching situation such as how hard the students try to
   achieve, their academic ability, the reputation of the school and
   student disciplinary, racial, etc. problems,

3. Localism of Background - a teacher with a high score has spent most of
   his life in a small geographic area and has graduated from high school
   and college in that locale,

4. Socio-Economic Background - comprised of the teacher's parents'
   educational level, father's occupation and rural-urbanness of their
   background,

5. Training - comprised of the teacher's highest degree held, certification,
   salary level and tenure,

6. College Attended - comprised of the kind of undergraduate institution
   attended (e.g., normal school, public or private university, etc.), the
   highest degree offered by that institution and the teacher's rating of
   the academic level of the institution,

7. Teaching Related Activities - comprised of the hours of unofficial time
   spent in preparation for class and counseling, the number of
   educational journals read regularly, etc.,

8. Preference for High Ability Students - teacher prefers to work with
   students of higher ability, socio-economic status, etc.,

9. Sex - scored high for a female, low for a male,

10. Racial-Ethnic Differences in Contextual Vocabulary - a variable created
    by assigning each teacher the average vocabulary score obtained by his
    racial or ethnic group,

11. Vocabulary Score - total number of items correct.

Principal and School Indices

1. Principal's Experience - comprised of age, number of years experience
   as a principal, etc.,

2. Principal's Training - comprised of the highest degree held and salary
   level,

3. Principal's College Attended - same as the teacher's index,

4. Principal's Sex - a variable scored high for a female, low for a male,

5. Plant and Physical Facilities - area of plant, possession of auditorium,
   gymnasium, etc.,

6. Instructional Facilities - special labs, shops, volumes in the library,
   etc.,

7. Specialized Staff and Services - art, music and remedial reading
   teachers, etc.,

8. Tracking - use of various kinds of ability grouping techniques,

9. Testing - frequency of different kinds of testing,

10. Transfers - number of students transferring in and out,

11. Remedial Programs - percent of students in remedial math and reading,

12. Free Milk and Lunch Programs - percent of students who get free milk
    and lunch,

13. Accreditation - whether or not school has state and regional
    accreditation,

14. Age of Texts - age of different texts used,

15. Availability of Texts - whether or not free texts are provided and if
    there is a sufficient number available,

16. Age of Building - a variable,

17. Pupils per room - a variable,

18. Pupils per teacher - a variable,

19. Number of students enrolled in the school,

20. School Reputation - the principal's estimate of the school's reputation.

In attempting to ascertain the influence of school variables on
achievement one must first take into account or equate schools for
differences in the kinds of students that they get initially. Thus if school
A had children primarily from families where intellectual activities were
not valued or pursued and school B had children from families where these
activities were valued and pursued, then one would expect the students in
school B to have higher achievement levels than students in school A.
These differences could be attributed to the influence of the different
families rather than to the schools. Thus it would seem fitting and
appropriate to equate schools for differences in the home background and
racial-ethnic composition of their students before looking at the influence
of school variables on achievement. By home background we will mean the
student indices of Socio-Economic Status and Family Structure and Stability,
and for racial-ethnic composition we will use the student Racial-Ethnic
difference variable.

Before we control for the combined effects of these variables using
multiple regression techniques it may be instructive to look at the
correlations of these variables with one another and with the dependent
variables of interest. These are given in Table 3.

The reader will note in looking at the first three rows in Table 3
against column 8, which is the Achievement column, that at least one and
usually more than one of the three variables that we are going to use to
equate schools for differences in student inputs are highly correlated
with Achievement as well as with the other dependent variables. This
suggests that after equating schools for these initial differences there
may be very few differences among schools in achievement that could be
related to other school variables. This reasoning is also supported by
reading across row 9 in Table 3. This row contains the multiple
correlation of the full set of 31 school variables with each of the other
variables. This row shows that the school variables are moderately to
highly correlated with each of the other variables.



Multiple Correlations 

Table 4 shows the squared multiple correlations obtained when the
dependent variables are regressed against the three control or equating
variables of Socio-Economic Status, Family Structure and Stability and the
Racial-Ethnic Composition of the student body. Looking across row 1 of
that table we see that achievement is the most highly predictable of the
dependent variables from the student body variables, having a squared
multiple correlation of .82 or a multiple correlation of about .91. We
might ask why school achievement should be so highly predictable using
these three variables. One interpretation is that these results reflect
the current social organization of our school systems. Thus schools
are organized along residential lines and residential areas are in turn
organized along socio-economic and racial-ethnic lines. This line of
thought is further supported by some of our analyses of individual students
when they are not aggregated by schools. These analyses showed that
individual student achievement was moderately predictable from the students'
Socio-Economic Status, Family Structure, and Racial-Ethnic group membership
(the multiple correlation being .60) (Mayeske et al., Unpublished Manuscript
Number 80). One can infer that some kind of a sorting process is going on
whereby white students with higher achievement and socio-economic status
go to schools with similar kinds of students, which has the effect of making
their aggregated school achievement more predictable than individual
achievement.

TABLE 3

Intercorrelations of Independent and Dependent Variables
(table body missing from this copy)

TABLE 4

Squared Multiple Correlations of Dependent Variables Against
Student Body Variables and School Variables

                                     Attitude   Educational
                                     Toward     Plans and    Study
Variable Set          Expectations   Life       Desires      Habits   Achievement
---------------------------------------------------------------------------------
1. Student Body          .5214        .5847      .6066        .7373     .8207
2. School                .1773        .3500      .3179        .2023     .7601
3. Student Body and
   School                .6309        .6386      .6679        .7773     .8662
4. (3) - (1)             .1095        .0539      .0613        .0400     .0455
5. (3) - (2)             .4536        .2886      .3500        .5750     .1061



If we are willing to grant that this sorting process takes place then 
what can we say about the effects of school variables in such a context? 

Row 2 of Table 4 shows the squared multiple correlations of the school
variables with the dependent variables. It is clear from this table that
all of the dependent variables are more predictable using the student body
variables than using the school variables. By comparing the values in
row 3 with their counterparts in row 1 we can get some idea of the additional
contribution of the school variables to the dependent variables, and by
comparing the values in row 3 with their counterparts in row 2 we can get
some idea of the additional contribution of the student body variables.

These differences, often called the unique variance or contribution, are
given in rows 4 and 5. Examination of the values in row 4 indicates that
the relative contributions of the school variables, after family background
and racial composition have been controlled for, are small but positive
for all the dependent variables. Examination of the values in row 5
indicates that the relative contributions of the student body variables,
after school variables have been controlled for, are moderate to large
except for Achievement.
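
The arithmetic behind rows 4 and 5 can be checked with a few lines of code. The sketch below simply copies rows 1 through 3 of Table 4 and takes the differences; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Squared multiple correlations copied from rows 1-3 of Table 4:
# (student body, school, student body and school).
TABLE_4 = {
    "Expectations": (0.5214, 0.1773, 0.6309),
    "Attitude Toward Life": (0.5847, 0.3500, 0.6386),
    "Educational Plans and Desires": (0.6066, 0.3179, 0.6679),
    "Study Habits": (0.7373, 0.2023, 0.7773),
    "Achievement": (0.8207, 0.7601, 0.8662),
}

def unique_contributions(r2_body, r2_school, r2_both):
    """Return (unique school, unique student body) contributions."""
    unique_school = r2_both - r2_body    # row 4: (3) - (1)
    unique_body = r2_both - r2_school    # row 5: (3) - (2)
    return unique_school, unique_body

for name, values in TABLE_4.items():
    u_school, u_body = unique_contributions(*values)
    print(f"{name:30s} school {u_school:.4f}   student body {u_body:.4f}")
```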

Since the relative contributions of the school variables are small,
does it mean they are unimportant? Not necessarily, for as we showed
earlier, the school variables tend to be bound up with the student body
characteristics, and this is particularly so for Achievement. Might we
then be able to develop an expression of this commonness or overlap?
We are indebted to Dr. Alex Mood for developing a measure which will allow
us to express this commonality.

Commonality: A definition of this measure of commonality is given
below.

Let:

  C(B, S)   = the commonality or overlap of the student body variables
              (B) and the school variables (S)

  R^2(B)    = the squared multiple correlation of the student body
              variables with the dependent variable

  R^2(S)    = the squared multiple correlation of the school variables
              with the dependent variable

  R^2(B, S) = the squared multiple correlation of the student body and
              school variables with the dependent variable

  U(B) = R^2(B, S) - R^2(S) = the unique contribution of the student body
                              variables

  U(S) = R^2(B, S) - R^2(B) = the unique contribution of the school
                              variables

Then C(B, S) = R^2(B, S) - U(B) - U(S), and R^2(S) and R^2(B) can be
expressed as:

  R^2(S) = C(B, S) + U(S) and
  R^2(B) = C(B, S) + U(B).

Table 5 gives the squared multiple correlations of the school variables
with the dependent variables when they are expressed as a function of their
unique contribution and their commonality coefficient with the student
body variables.

TABLE 5

The Squared Multiple Correlations of the School Variables With
the Dependent Variables Expressed as a Function of Their Unique
Contribution and Their Commonality Coefficient With the Student
Body Variables

                                 R^2(S)  =  C(B, S)  +  U(S)
-------------------------------------------------------------
Expectations                     .1773   =   .0678   +  .1095
Attitude Toward Life             .3500   =   .2961   +  .0539
Educational Plans and Desires    .3179   =   .2566   +  .0613
Study Habits                     .2023   =   .1623   +  .0400
Achievement                      .7601   =   .7146   +  .0455

In looking at Table 5 we note in the first column that
achievement is the most predictable of the dependent variables from the
school variables. Next, in descending order, are Attitude Toward Life,
Educational Plans and Desires, Study Habits, and Expectations. When we
look at the commonality coefficient C(B, S), we note that almost all of
the variance in achievement predictable from school variables is bound up
in the student body-school overlap. Although the level of predictability
is lower, this same trend holds for Attitude Toward Life, Educational Plans
and Desires and Study Habits. The school has its greatest unique
contribution for Expectations and less so for the other variables.
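
Substituting the definitions of U(B) and U(S) into the definition of C(B, S) gives an equivalent form that can be checked directly against Tables 4 and 5. The algebra below only restates the definitions given earlier; it is not an additional result of the study.

```latex
\begin{align*}
C(B,S) &= R^2(B,S) - U(B) - U(S) \\
       &= R^2(B,S) - \left[R^2(B,S) - R^2(S)\right] - \left[R^2(B,S) - R^2(B)\right] \\
       &= R^2(B) + R^2(S) - R^2(B,S).
\end{align*}
% Check against Table 4:
%   Achievement:   .8207 + .7601 - .8662 = .7146
%   Expectations:  .5214 + .1773 - .6309 = .0678
```

In this form the commonality is simply the amount by which the two sets of variables, taken separately, overstate what they predict jointly.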

Table 6 gives the squared multiple correlations of the student body
variables with the dependent variables when they are expressed as a
function of their unique contribution and their commonality coefficient
with the school variables.

TABLE 6

The Squared Multiple Correlations of the Student Body Variables
With the Dependent Variables Expressed as a Function of Their
Unique Contribution and Their Commonality Coefficient With the
School Variables

                                 R^2(B)  =  C(B, S)  +  U(B)
-------------------------------------------------------------
Expectations                     .5214   =   .0678   +  .4536
Attitude Toward Life             .5847   =   .2961   +  .2886
Educational Plans and Desires    .6066   =   .2566   +  .3500
Study Habits                     .7373   =   .1623   +  .5750
Achievement                      .8207   =   .7146   +  .1061

In looking at Table 6 we can note, again in the first column, that
Achievement is the most predictable of the dependent variables from the
student body variables. Next, in descending order, are Study Habits,
Educational Plans and Desires, Attitude Toward Life and Expectations.
The student body variables have their greatest unique contribution for
Study Habits (.5750) and their smallest unique contribution for
Achievement (.1061).

In view of the small unique contribution of the school variables, does
this mean that they are unimportant or have little influence? No, it does
not. What it does indicate is that it is very difficult to specify just
how influential these variables might be in bringing about student
achievement.



In light of these considerations one might conclude that both the 
family background of the student and his school are important in promoting 
achievement. We might speculate for a moment on various avenues that 
might be fruitfully explored along these lines. 

These analyses show that the family-home background constellation of
Socio-Economic Status, Family Structure and Racial-Ethnic group membership
bears an important relationship to achievement, the multiple correlation
being .91 when students are aggregated by schools and .60 when they are
not aggregated by schools. This suggests that where family involvement in
the child's education is not present or is only weakly present, substantial
gains in Achievement might be realized through a greater involvement of
the family in support of the child's education.

In considering the school as an avenue for promoting student
achievement it may be instructive to see which school variables, such as the
facilities, special programs, and teachers' training and experience, are
related to achievement. When we inspect these individual correlations we
are impressed by the low degree of relationship that exists. This indicates
that small changes in just a few variables will not bring about substantial
gains in achievement. Perhaps radical departures from existing practices
will bring about these desired changes; at least we should give them a try.